as $\omega$ and $\hat{x}$ are bilinear with each other as $\omega\hat{x}^{[k]}$. In our discrete optimization framework, the discrete values of the convolutional kernels are updated according to their gradients. Taking Eq. 3.36 into consideration, we derive the update rule for $\hat{x}^{[k+1]}$ as
$$\hat{x}^{[k+1]} = \hat{x}^{[k]} - \eta\,\frac{\partial f(\omega, \hat{x}^{[k]})}{\partial \hat{x}^{[k]}} = \hat{x}^{[k]} - \omega\eta\delta_{\hat{x}}^{[k]}. \tag{3.37}$$
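As a minimal numerical sketch of this update (the scalar toy loss $f$ and all values below are illustrative assumptions, not from the original):

```python
# Toy bilinear setting: f depends on x_hat only through the product w * x_hat,
# here f(w, x_hat) = 0.5 * (w * x_hat - target)^2.
w, eta, target = 0.8, 0.1, 0.5   # scale, learning rate, toy regression target
x_hat = 1.0                      # current value of the quantized variable

delta = w * x_hat - target       # gradient of f w.r.t. the product (w * x_hat)

# Eq. 3.37: the chain rule through the bilinear form contributes the factor w.
x_hat_new = x_hat - eta * w * delta
print(x_hat_new)
```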

By plugging Eq. 3.37 into Eq. 3.35, we obtain a new objective (loss) function that minimizes
$$\|\hat{x}^{[k+1]} - \omega x\| \tag{3.38}$$
to approximate
$$\hat{x} = \omega x, \quad x = \omega^{-1}\hat{x}. \tag{3.39}$$

We further discuss multiple projections, based on Eq. 3.39 and the projection loss in (3.34), and have
$$\min_{x} \frac{1}{2}\sum_{j}^{J} \|x - \omega_j^{-1}\hat{x}_j\|^2. \tag{3.40}$$

We set $g(x) = \frac{1}{2}\sum_{j}^{J} \|x - \omega_j^{-1}\hat{x}_j\|^2$ and set its derivative to zero, $\nabla g(x) = \sum_{j}^{J}\big(x - \omega_j^{-1}\hat{x}_j\big) = 0$, giving
$$x = \frac{1}{J}\sum_{j}^{J} \omega_j^{-1}\hat{x}_j, \tag{3.41}$$
which shows that multiple projections can better reconstruct the full-precision kernels from their binary counterparts.
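As a quick numerical check of Eq. 3.41 (a sketch with made-up scalar scales $\omega_j$ and simulated quantization errors):

```python
import numpy as np

# J scalar projections with distinct scales w_j (illustrative values).
w = np.array([0.5, 0.8, 1.2])
x_true = 0.7

# Simulate the J quantized outputs x_hat_j = w_j * x_true plus independent
# quantization errors, then reconstruct x via Eq. 3.41.
noise = np.array([0.02, -0.01, 0.03])
x_hat = w * x_true + noise

x_rec = np.mean(x_hat / w)   # x = (1/J) * sum_j w_j^{-1} * x_hat_j
print(x_rec)                 # averaging over J projections damps the
                             # independent per-projection errors
```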

3.5.4 Projection Convolutional Neural Networks

PCNNs, shown in Fig. 3.12, are built on DBPP for model quantization. We accomplish this by reformulating the projection loss in (3.34) within the deep learning paradigm as

$$L_P = \frac{\lambda}{2}\sum_{l,i}^{L,I}\sum_{j}^{J}\left\|\hat{C}_{i,j}^{l,[k]} - \widetilde{W}_{j}^{l,[k]} \odot \big(C_{i}^{l,[k]} + \eta\delta_{\hat{C}_{i,j}^{l,[k]}}\big)\right\|^2, \tag{3.42}$$

where $C_{i}^{l,[k]}$, $l \in \{1, ..., L\}$, $i \in \{1, ..., I\}$, denotes the $i$th kernel tensor of the $l$th convolutional layer at the $k$th iteration. $\hat{C}_{i,j}^{l,[k]}$ is the quantized kernel of $C_{i}^{l,[k]}$ via the projection $P_{\Omega}^{l,j}$, $j \in \{1, ..., J\}$, as

$$\hat{C}_{i,j}^{l,[k]} = P_{\Omega}^{l,j}\big(\widetilde{W}_{j}^{l,[k]}, C_{i}^{l,[k]}\big), \tag{3.43}$$

where $\widetilde{W}_{j}^{l,[k]}$ is a tensor obtained by duplicating a learned projection matrix $W_{j}^{l,[k]}$ along the channels, so that it fits the dimension of $C_{i}^{l,[k]}$. $\delta_{\hat{C}_{i,j}^{l,[k]}}$ is the gradient at $\hat{C}_{i,j}^{l,[k]}$ calculated based on $L_S$, that is, $\delta_{\hat{C}_{i,j}^{l,[k]}} = \frac{\partial L_S}{\partial \hat{C}_{i,j}^{l,[k]}}$. The iteration index $[k]$ is omitted hereafter for simplicity.
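A hedged PyTorch-style sketch of Eqs. 3.42 and 3.43 for a single layer; here $P_\Omega$ is assumed to be a nearest-neighbor projection onto the discrete set $\Omega_N$ (consistent with the DBPP formulation earlier in the chapter), and all function names, tensor names, and shapes are illustrative rather than the authors' implementation:

```python
import torch

def project_onto_omega(t, omega):
    # Assumed P_Omega: snap each entry of t to the nearest value in the
    # discrete set Omega_N (omega is a 1-D tensor of allowed values).
    idx = torch.argmin((t.unsqueeze(-1) - omega).abs(), dim=-1)
    return omega[idx]

def projection_loss(C, W_list, grads, omega, eta, lam):
    # C:      full-precision kernels of one layer, shape (I, Cin, kH, kW)
    # W_list: J learned projection matrices W_j, each (kH, kW); expanding
    #         them along the channel dims yields the W-tilde of Eq. 3.43
    # grads:  J gradients delta_{C_hat_j} of L_S w.r.t. each quantized kernel
    # omega:  discrete value set Omega_N; eta: learning rate; lam: lambda
    loss = C.new_zeros(())
    for W, g in zip(W_list, grads):
        W_t = W.expand_as(C)                        # duplicate along channels
        C_hat = project_onto_omega(W_t * C, omega)  # Eq. 3.43
        loss = loss + ((C_hat - W_t * (C + eta * g)) ** 2).sum()  # Eq. 3.42
    return 0.5 * lam * loss
```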

In PCNNs, both the cross-entropy loss and the projection loss are used to build the total loss as
$$L = L_S + L_P. \tag{3.44}$$
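In a training step, the two terms would simply be summed (sketch; `logits`, `labels`, and the arguments of the hypothetical `projection_loss` above are placeholders):

```python
import torch.nn.functional as F

L_S = F.cross_entropy(logits, labels)                     # cross-entropy loss
L_P = projection_loss(C, W_list, grads, omega, eta, lam)  # Eq. 3.42
loss = L_S + L_P                                          # Eq. 3.44
loss.backward()
```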

The proposed projection loss regularizes the continuous values to converge onto $\Omega_N$ while the cross-entropy loss is minimized, as illustrated in Fig. 4.15 and Fig. 3.25.